Using Multiple Metrics in Automatically Building Turkish Paraphrase Corpus
Similar resources
Using Multiple Metrics in Automatically Building Turkish Paraphrase Corpus
Paraphrasing is expressing similar meanings with different words in a different order. In this sense it is viewed as translation within the same language. It is an important issue in natural language processing for automatic machine translation, question answering, text summarization and language generation. Studies in paraphrasing can be classified as paraphrase extraction, paraphrase generation, pa...
Turkish Paraphrase Corpus
Paraphrases are alternative syntactic forms in the same language expressing the same semantic content. Speakers of all languages are inherently familiar with paraphrases at different levels of granularity (lexical, phrasal, and sentential). For quite some time, the concept of paraphrasing has been receiving growing attention from the research community and its potential use in several natural language ...
Building a Paraphrase Corpus for Speech Translation
When a machine translation (MT) system receives input sentences of spoken language, the following two types of sentences are difficult to translate: (1) long sentences and (2) sentences having redundant expressions often seen in spoken language. To reduce these difficulties, we are developing methods to paraphrase input sentences into more translatable ones. In this paper, we report a prelimina...
Building a Non-Trivial Paraphrase Corpus Using Multiple Machine Translation Systems
We propose a novel sentential paraphrase acquisition method. To build a well-balanced corpus for Paraphrase Identification, we especially focus on acquiring both non-trivial positive and negative instances. We use multiple machine translation systems to generate positive candidates and a monolingual corpus to extract negative candidates. To collect non-trivial instances, the candidates are unifor...
Building a Swedish-Turkish Parallel Corpus
We present a Swedish-Turkish Parallel Corpus intended for use in linguistic research, teaching, and applications in natural language processing, primarily machine translation. The corpus, currently under development, is built using a Basic LAnguage Resource Kit (BLARK) for the two languages, which is then used in the automatic alignment phase to improve alignment accuracy. The corpus is balanced wi...
Journal
Journal title: Research in Computing Science
Year: 2016
ISSN: 1870-4069
DOI: 10.13053/rcs-117-1-6